49 research outputs found

    Image Analysis Enhanced Event Detection from Geo-tagged Tweet Streams

    Events detected from social media streams often include early signs of accidents, crimes or disasters. Therefore, they can be used by related parties for timely and efficient response. Although significant progress has been made on event detection from tweet streams, most existing methods have not considered the images posted in tweets, which provide richer information than the text and can potentially be a reliable indicator of whether an event occurs or not. In this paper, we design an event detection algorithm that combines textual, statistical and image information, following an unsupervised machine learning approach. Specifically, the algorithm starts with semantic and statistical analyses to obtain a list of tweet clusters, each of which corresponds to an event candidate, and then performs image analysis to separate events from non-events: a convolutional autoencoder is trained for each cluster as an anomaly detector, where a part of the images is used as the training data and the remaining images are used as the test instances. Our experiments on multiple datasets verify that when an event occurs, the mean reconstruction errors of the training and test images are much closer, compared with the case where the candidate is a non-event cluster. Based on this finding, the algorithm rejects a candidate if the difference is larger than a threshold. Experimental results over millions of tweets demonstrate that this image analysis enhanced approach can significantly increase the precision with minimal impact on the recall. Comment: 12 pages, 4 figures
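
    The rejection rule described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `is_event`, the use of an absolute difference, and the threshold value are assumptions, and the per-image reconstruction errors are taken as given rather than computed by an autoencoder.

```python
from statistics import mean


def is_event(train_errors, test_errors, threshold):
    """Accept a candidate cluster when the mean reconstruction errors of the
    training and test images are close; reject it otherwise.

    train_errors / test_errors: per-image reconstruction errors (hypothetical
    inputs; the abstract obtains them from a per-cluster convolutional
    autoencoder). threshold: assumed tuning parameter.
    """
    gap = abs(mean(test_errors) - mean(train_errors))
    return gap <= threshold


# Images in an event cluster look alike, so held-out errors stay close.
event_like = is_event([0.10, 0.12], [0.11, 0.13], threshold=0.05)
# A non-event cluster generalises poorly, so the gap is large.
noise_like = is_event([0.10, 0.12], [0.40, 0.50], threshold=0.05)
```

    In practice the threshold would have to be chosen on held-out data; the abstract does not state its value.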

    Detection of variants in dystroglycanopathy-associated genes through the application of targeted whole-exome sequencing analysis to a large cohort of patients with unexplained limb-girdle muscle weakness

    Background: Dystroglycanopathies are a clinically and genetically heterogeneous group of disorders that are typically characterised by limb-girdle muscle weakness. Mutations in 18 different genes have been associated with dystroglycanopathies, the encoded proteins of which typically modulate the binding of alpha-dystroglycan to extracellular matrix ligands by altering its glycosylation. This results in a disruption of the structural integrity of the myocyte, ultimately leading to muscle degeneration. Methods: Deep phenotypic information was gathered using the PhenoTips online software for 1001 patients with unexplained limb-girdle muscle weakness from 43 different centres across 21 European and Middle Eastern countries. Whole-exome sequencing with at least 250 ng DNA was completed using an Illumina exome capture and a 38 Mb baited target. Genes known to be associated with dystroglycanopathies were analysed for disease-causing variants. Results: Suspected pathogenic variants were detected in DPM3, ISPD, POMT1 and FKTN in one patient each, in POMK in two patients, in GMPPB in three patients, in FKRP in eight patients and in POMT2 in ten patients. This indicated a frequency of 2.7% for the disease group within the cohort of 1001 patients with unexplained limb-girdle muscle weakness. The phenotypes of the 27 patients were highly variable, yet with a fundamental presentation of proximal muscle weakness and elevated serum creatine kinase. Conclusions: Overall, we have identified 27 patients with suspected pathogenic variants in dystroglycanopathy-associated genes. We present evidence for the genetic and phenotypic diversity of the dystroglycanopathies as a disease group, while also highlighting the advantage of incorporating next-generation sequencing into the diagnostic pathway of rare diseases. Peer reviewed
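
    The reported totals can be checked directly from the per-gene counts in the Results. The short sketch below only reproduces that arithmetic; the variable names are illustrative.

```python
# Patients with suspected pathogenic variants, per gene, as listed in the abstract.
patients_per_gene = {
    "DPM3": 1, "ISPD": 1, "POMT1": 1, "FKTN": 1,
    "POMK": 2, "GMPPB": 3, "FKRP": 8, "POMT2": 10,
}
cohort_size = 1001

total_patients = sum(patients_per_gene.values())
frequency_pct = round(100 * total_patients / cohort_size, 1)

print(total_patients, frequency_pct)
```

    The sum gives the 27 patients and the 2.7% frequency quoted in the abstract.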

    A UI prototype for emotion-based event detection in the live web

    Microblogging platforms are at the core of what is known as the Live Web: the most dynamic and fast-changing portion of the web, where content is generated constantly by the users, in snippets of information. Therefore, the Live Web (or Now Web) is a good source of information for event detection, because it reflects what is happening in the physical world in a timely manner. Meanwhile, it introduces constraints and challenges: large volumes of unstructured, noisy data, which are also as diverse as the users and their interests. In this work we present a prototype User Interface (UI) of our TwInsight system, which deals with event detection of real-world phenomena from microblogs. Our system applies i) emotion extraction techniques on microblogs, and ii) location extraction techniques on user profiles. Combining these two, we convert highly unstructured content to thematically enriched, locational information, which we present to the user through a unified front-end. A separate area of the UI is used to show events to the user as they are identified. Taking into account the characteristics of the setting, all of the components are updated along the temporal dimension. We discuss each part of our UI in detail, and present anecdotal evidence of its operation through two real-life event examples. © 2013 Springer-Verlag

    Efficient and adaptive distributed skyline computation

    Skyline queries have attracted considerable attention over the last few years, mainly due to their ability to return interesting objects without the need for user-defined scoring functions. In this work, we study the problem of distributed skyline computation and propose an adaptive algorithm for controlling the degree of parallelism and the required network traffic. In contrast to state-of-the-art methods, our algorithm efficiently handles diverse preferences imposed on attributes. The key idea is to partition the data using a grid scheme and, for each query, to build on-the-fly a dependency graph among partitions, which can help in effective pruning. Our algorithm operates in two modes: (i) full-parallel mode, where processors are activated simultaneously, or (ii) cascading mode, where processors are activated in a cascading manner using propagation of intermediate results, thus reducing network traffic and potentially increasing throughput. Performance evaluation results, based on real-life and synthetic data sets, demonstrate the scalability with respect to the number of processors and database size. © 2010 Springer-Verlag Berlin Heidelberg
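
    For readers unfamiliar with skyline queries, the sketch below shows the basic dominance test and a simple single-machine skyline. It is not the distributed, grid-based algorithm of the paper; it assumes that smaller values are preferred on every attribute, and the names `dominates` and `skyline` are illustrative.

```python
def dominates(a, b):
    """a dominates b if a is no worse on every attribute and strictly
    better on at least one (assumption: smaller is better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(points):
    """Return the points that no other point dominates (nested-loop style)."""
    result = []
    for p in points:
        if any(dominates(q, p) for q in result):
            continue  # p is dominated by a current skyline candidate
        # p survives: drop candidates that p dominates, then keep p
        result = [q for q in result if not dominates(p, q)]
        result.append(p)
    return result


# Hypothetical (price, distance) pairs, lower is better on both.
hotels = [(1, 9), (3, 3), (5, 5), (9, 1), (4, 2)]
best = skyline(hotels)
```

    Here `(5, 5)` is dropped because `(3, 3)` is better on both attributes, while the other four points are mutually incomparable and form the skyline.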

    Debugging applications created by a Domain Specific Language: The IPAC case

    Software developers have created a large number of applications across the research domains of Computer Science, yet not all developers are familiar with most of those domains. Domain Specific Languages (DSLs) can provide an abstract yet concrete description of a domain in terms that developers can easily manage. The most important requirement in such cases is a debugger for the software generated from a specific DSL. In this paper, we propose and present a simple but efficient debugger created for the needs of the IPAC system. The debugger provides debugging facilities to developers who define applications for autonomous mobile nodes, and it can map code lines between the initial application workflow and the final code defined in a known programming language. Finally, we propose a logging server responsible for providing debugging facilities for the IPAC framework. The IPAC system consists of a number of middleware services for mobile nodes acting in a network. In this system, mobile nodes exchange messages that are visualized for more efficient manipulation. © 2011 Elsevier Inc. All rights reserved

    A faceted crawler for the Twitter service

    Researchers nowadays have at their disposal valuable data from social networking applications, of which Twitter and Facebook are the most prominent examples. To retrieve this content, the Twitter service provides two distinct Application Programming Interfaces (APIs): a probe-based one and a streaming one, each of which imposes different limitations on the data collection process. In this paper, we present a general architecture to facilitate faceted crawling of the service, which simplifies retrieval. We give implementation details of our system, while providing a simple way to express the crawling process, i.e., the crawl flow. We experimentally evaluate it on a variety of faceted crawls, demonstrating its efficacy for the online medium. © Springer International Publishing Switzerland 2014

    Mining Competitors from Large Unstructured Datasets

    In any competitive business, success is based on the ability to make an item more appealing to customers than the competition. A number of questions arise in the context of this task: how do we formalize and quantify the competitiveness between two items? Who are the main competitors of a given item? What are the features of an item that most affect its competitiveness? Despite the impact and relevance of this problem to many domains, only a limited amount of work has been devoted toward an effective solution. In this paper, we present a formal definition of the competitiveness between two items, based on the market segments that they can both cover. Our evaluation of competitiveness utilizes customer reviews, an abundant source of information that is available in a wide range of domains. We present efficient methods for evaluating competitiveness in large review datasets and address the natural problem of finding the top-k competitors of a given item. Finally, we evaluate the quality of our results and the scalability of our approach using multiple datasets from different domains. © 1989-2012 IEEE

    Compost production from Greek domestic refuse


    Adaptive Processing of Multi-Criteria Decision Support Queries
